Coronaviruses are a family of enveloped, plus-stranded 
RNA viruses with helical nucleocapsids and extraordinarily 
large genomes. The hallmark of coronavirus transcription is 
the production of multiple subgenomic mRNAs that contain 
sequences corresponding to both ends of the genome. (Tran- 
scription is defined as the process whereby subgenome-sized 
mRNAs are produced, and replication is the process whereby 
genome-sized RNA, which also functions as mRNA, is pro- 
duced.) Thus, the generation of subgenomic mRNAs involves 
a process of discontinuous transcription. The aim of this 
minireview is to describe our current understanding of coro- 
navirus replication and transcription. For more detailed infor- 
mation, the reader is directed to other recent reviews (25, 28, 
44a, 49, 63, 74, 94). 

The coronavirus genomic RNA of approximately 30,000 nu- 
cleotides encodes structural proteins of the virus, nonstructural 
proteins that have a critical role in viral RNA synthesis (which 
we will refer to as replicase-transcriptase proteins), and non- 
structural proteins that are nonessential for virus replication in 
cell culture but appear to confer a selective advantage in vivo 
(which we will refer to as niche-specific proteins). At least one 
niche-specific protein, nonstructural protein 2 (nsp2), and one 
structural protein, the nucleocapsid protein (N), are involved 
in viral RNA synthesis (1, 30, 68). 

The expression of the coronavirus replicase-transcriptase 
protein genes is mediated by the translation of the genomic 
RNA. The replicase-transcriptase proteins are encoded in 
open reading frame 1a (ORF1a) and ORF1b and are synthesized
initially as two large polyproteins, pp1a and pp1ab. The
synthesis of pp1ab involves programmed ribosomal frameshifting
during translation of ORF1a (48). During or after
synthesis, these polyproteins are cleaved by virus-encoded proteinases
with papain-like (PLpro) and chymotrypsin-like folds
into 16 proteins; nsp1 to nsp11 are encoded in ORF1a, and
nsp12 to nsp16 are encoded in ORF1b (Fig. 1) (96). The
replicase-transcriptase proteins, together with other viral pro- 
teins and, possibly, cellular proteins, assemble into membrane- 
bound replication-transcription complexes (RTC). (We will 
use the term RTC to describe complexes copying or producing 
genome- or subgenome-length RNA.) These complexes accumulate
at perinuclear regions and are associated with double-membrane
vesicles (15, 76). Hydrophobic transmembrane domains
are present in nsp3, nsp4, and nsp6 and likely serve to
anchor the nascent pp1a/pp1ab polyproteins to membranes
during the first step of RTC formation.
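
The effect of a programmed -1 ribosomal frameshift on the reading frame can be illustrated with a minimal Python sketch. The RNA string, the function name read_codons, and the slip position are toy assumptions chosen for illustration; they are not taken from any coronavirus genome or from the cited work.

```python
def read_codons(rna, start=0, slip_at=None):
    """Split `rna` into codons starting at index `start`.

    If `slip_at` is given, the reader steps back one nucleotide (a -1
    frameshift) once, immediately after the codon that ends at index `slip_at`.
    """
    codons = []
    i = start
    while i + 3 <= len(rna):
        codons.append(rna[i:i + 3])
        i += 3
        if slip_at is not None and i == slip_at:
            i -= 1          # -1 frameshift: the last nucleotide is read again
            slip_at = None  # the shift happens only once
    return codons


# Toy message (hypothetical): the zero frame ends in a UAA stop codon.
toy_rna = "AUGUUUAAACGGGCCUAA"

zero_frame = read_codons(toy_rna)               # no frameshift
shifted = read_codons(toy_rna, slip_at=9)       # shift after the third codon

print(zero_frame)  # ['AUG', 'UUU', 'AAA', 'CGG', 'GCC', 'UAA']
print(shifted)     # ['AUG', 'UUU', 'AAA', 'ACG', 'GGC', 'CUA']
```

In this toy case the unshifted reader reaches a stop codon, whereas the shifted reader continues in the -1 frame, which mirrors in simplified form how translation of ORF1a alone and translation through the frameshift yield two different polyproteins.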

Subcellular fractionation, electron microscopic in situ hy- 
bridization, and immunofluorescence studies suggest that 
most, if not all, coronavirus nsp proteins are recruited to RTCs 
synthesizing both genome- and subgenome-length RNA (29, 
32, 52, 76). At later times of infection, it appears that the nsp 
proteins encoded in ORF 1a remain tightly bound to the RTC, 
while proteins encoded in ORF1b detach and diffuse to the 
cytosol (85). In addition to the nsp proteins, and consistent 
with it playing a role in RNA synthesis, the RTC contains the 
viral N protein (13, 85). A number of cellular proteins have 
been shown to interact with coronaviral RNA. These include 
heterogeneous nuclear ribonucleoprotein A1, polypyrimidine 
tract binding protein, poly(A)-binding protein, and mitochon- 
drial aconitase (73), although there is no evidence that these 
proteins specifically colocalize with RTCs. 

Sequence analysis of the nsp1 to -16 proteins predicts that 
they have at least eight enzymatic activities (75). Some of these 
activities, e.g., proteinase, RNA-dependent RNA polymerase 
(RdRp), and 5’-to-3’ helicase (HEL) activities, are common to 
RNA viruses, but others appear to be unique to coronaviruses 
or viruses closely related to them. Clearly, while most of these 
enzymatic functions are concerned with viral RNA synthesis, 
some may also have relevance to cellular processes. For example,
nsp3, which contains the PLpro activity, is a deubiquitinating
enzyme and is capable of reversing the conjugation of
protein with ISG15 (interferon-stimulated gene 15), which may 
subvert cellular processes to facilitate viral replication (7, 42). 
Also, the ADP-ribose 1''-phosphatase (ADRP) activity of nsp3 
may act to influence levels of ADP-ribose, a key regulatory 
molecule in the cell. 


CORONAVIRUS RNA REPLICATION AND 
TRANSCRIPTION 


Model of discontinuous extension during subgenome-length 
minus-strand synthesis. We have proposed a model of discon- 
tinuous extension during subgenome-length minus-strand syn- 
thesis to explain the generation of coronavirus mRNAs (65) 
(Fig. 2). Each subgenome-length mRNA contains a 5’ leader 
sequence corresponding to the 5’ end of the genome. This 5’ 
leader is joined to an mRNA “body,” which represents sequences
from the 3’-poly(A) stretch to a position that is upstream
of each genomic ORF encoding a structural or niche-specific
protein. The junction of the leader and body elements
in each mRNA can be identified by a characteristic short,
AU-rich motif of about 10 nucleotides that is known as the
transcription-regulating sequence (TRS). In the genome, functional
TRS motifs are found at the 3’ end of the leader (leader
TRS) and in front of each ORF that is destined to become 5’
proximal in one of the subgenome-length mRNAs (body
TRSs).

FIG. 1. Organization and expression of the MHV-A59 genome. The structural relationships of the MHV-A59 genome- and subgenome-length
mRNAs are shown. The virus ORFs are depicted in teal (nsp1-nsp16 genes), blue (ns2, ns4a, ns4b, and ns5a genes), and green (S, M, E, N, and
I structural protein genes). The ORFs are defined by the genomic sequence of MHV-A59 as published by Coley et al. (20). The open red box
represents the common 5’ leader sequence, and the barred circle represents the programmed (-1) frameshifting element. The translation
products of the genome- and subgenome-length mRNAs are depicted, and the autoproteolytic processing of the ORF1a and ORF1a/ORF1b
polyproteins into proteins nsp1 to nsp16 is shown. A number of confirmed and putative functional domains in the nsp proteins are also indicated.
NeU, uridylate-specific endoribonuclease; PL1, papain-like protease 1; PL2, papain-like protease 2.
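
The arrangement of one leader TRS and several body TRSs along a genome can be sketched in a few lines of Python. Everything in this example is a toy assumption: the 7-nucleotide motif, the spacer sequences, and the function names are invented for illustration, and the text above does not specify a TRS sequence.

```python
def find_motif(genome, motif):
    """Return the 0-based start of every occurrence of `motif` in `genome`."""
    hits = []
    pos = genome.find(motif)
    while pos != -1:
        hits.append(pos)
        pos = genome.find(motif, pos + 1)
    return hits


def minus_strand_copy(plus):
    """Return the complementary RNA strand, written 5' to 3'."""
    pair = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pair[base] for base in reversed(plus))


TRS = "UCUAAAC"  # hypothetical motif used only for this sketch

# Toy plus-strand "genome": a leader TRS followed by two body TRSs.
toy_genome = "GGA" + TRS + "GGCCGGA" + TRS + "GCGCAA" + TRS + "GGAAAA"

hits = find_motif(toy_genome, TRS)
leader_trs, body_trss = hits[0], hits[1:]
print(leader_trs, body_trss)   # 3 [17, 30]

# The minus-strand copy of a body TRS is complementary to the leader TRS,
# because leader and body TRSs share the same plus-strand sequence.
anti_trs = minus_strand_copy(TRS)
print(anti_trs)                # GUUUAGA
```

Because the copy of any body TRS in a nascent minus strand is complementary to the leader TRS as well, the sketch shows why base pairing alone could, in principle, bring the two elements together.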

Our explanation of how the subgenomic mRNAs are gener- 
ated has two central tenets: (i) that the process of discontinu- 
ous transcription occurs during the synthesis of minus-strand 
subgenome-length templates, and (ii) that the process of dis- 
continuous transcription resembles the mechanism of similar- 
ity-assisted or high-frequency copy-choice RNA recombina- 
tion. 

Mechanistically, the process of discontinuous transcription 
during minus-strand synthesis can be viewed as a number of 
consecutive events. (i) The components of a functional RTC 
are recruited and minus-strand synthesis is initiated at the 3’ 
end of a genomic RNA. (ii) Elongation of nascent minus- 
strand RNA continues until the first functional body TRS 
motif is encountered. A fixed proportion of RTCs will either 
(iii) disregard the TRS motif and continue to elongate the 
nascent strand or (iv) stop synthesis of the nascent minus 
strand and relocate and complete its synthesis. (v) This relocation
will be guided by complementarity between the 3’ end of
the nascent minus strand and the leader TRS motif. The translocated
minus strand will be extended by copying the 5’ end of
a genome. The completed minus-strand RNA would then 
serve as a template for mRNA synthesis. The evidence to 
support this model is now extensive and has been discussed in 
detail in recent reviews (49, 63). 
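
Steps (ii) to (iv) can be read as a chain of independent decisions, which the following Python sketch turns into expected proportions of minus-strand products. The attenuation probabilities are purely hypothetical numbers, and the assumption that each body TRS acts independently with a fixed probability is a simplification made for this illustration rather than a claim of the model.

```python
def minus_strand_fractions(attenuation):
    """Expected fate of RTCs that initiate at the 3' end of the genome.

    `attenuation` lists, for each body TRS in the order the RTC meets it
    (3'-most first), the assumed probability that synthesis stops there and
    relocates to the leader. Returns (fractions, read_through): the fraction
    of RTCs producing each subgenome-length minus strand, and the fraction
    that ignores every body TRS and makes a genome-length minus strand.
    """
    remaining = 1.0
    fractions = []
    for p in attenuation:
        fractions.append(remaining * p)  # stop and relocate at this TRS
        remaining *= 1.0 - p             # disregard the TRS and elongate
    return fractions, remaining


# Hypothetical values: every body TRS attenuates half of the arriving RTCs.
fractions, read_through = minus_strand_fractions([0.5, 0.5, 0.5])
print(fractions)      # [0.5, 0.25, 0.125]
print(read_through)   # 0.125
```

Under these assumptions the product templated by the 3’-most body TRS is the most abundant, and the ratios stay fixed as long as the per-TRS probabilities do not change.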

There are many unanswered questions that are central to the 
model. 

First, we have only a vague idea of what constitutes the 
signal that stops, or attenuates, minus-strand synthesis at each 
of the body TRS motifs. Three parameters that are considered 
important are the stability of TRS base pairing between the 
template and nascent minus strand, the nature of sequences 
flanking the TRS motifs, and the location of the TRS motif 
relative to the promoter for minus-strand synthesis (21, 25, 77, 
92a, 97). Also, it is often proposed that protein-to-protein 
interactions may be important for attenuation but, in our opin- 
ion, the lack of rigid sequence specificity in the TRS base- 
pairing interaction implies that protein cofactors will not be 
major players in TRS recognition. Rather, specific RNA motifs 
(cis-acting and higher-order RNA structures) and their acces- 
sibility may be of importance, along with TRS base-pairing 
potential. Further evidence for the extremely tight regulation 
of body TRS attenuation is provided by a series of experiments 
done with the Equine arteritis virus (EAV) system; EAV belongs
to a family that is distantly related to coronaviruses, with
similar, if not identical, mechanisms of subgenome-length
mRNA synthesis (49). In this system, it has been shown that
the functional inactivation of a 3’-proximal body TRS element
by minimal mutagenesis does not, as might have been predicted,
increase the frequency with which more 3’-distal body
TRS elements are used (50). In other words, the TRS element
still enforces attenuation, i.e., the “launching” of the RTC,
even when the “landing” is prevented.

FIG. 2. Model for coronavirus replication-transcription. The ORF1 of genomic RNA (red) is translated to produce pp1a and pp1ab, which
assemble into an RTC (teal oval) that recognizes cis-acting elements at the 5’ and 3’ ends of the genome. This RTC copies the genome either
continuously into genome-length template or discontinuously into the various subgenome-length minus-strand templates. The minus strands (blue)
are used as templates for genomic and subgenomic mRNA synthesis. Only genomes are used as templates for minus-strand synthesis, i.e.,
replication. The RTCs engaging in plus-strand synthesis age and release their minus-strand templates, which are then degraded specifically.

Second, there is no information on how the relocation of the 
3’ end of the nascent minus strand to the 5’ end of the genomic 
template is mediated or how the realignment of complemen- 
tary bases on the template is facilitated. One idea is that this 
relocation might be mediated by protein-to-protein interac- 
tions between the polymerase attached to the growing minus 
strand and a protein associated with the 5’ end of the genome. 
It also could occur by the polymerase binding directly to the 
sequences downstream of the leader RNA. A variation on this 
theme is that the 5’ end of the genomic RNA is actually bound 
to the polymerase attached to the growing minus strand and 
essentially scans the newly synthesized RNA for the synthesis 
of the complementary TRS sequence (25). This is an attractive 
idea, but it implies that, at any one time, each genomic RNA 
template engages in the synthesis of only one subgenome-length
minus strand, and it is contradictory to the idea that
each template generates concurrently a complete set of ge- 
nome- and subgenome-length minus strands. The difference 
should be experimentally testable once sensitive methods are 
developed. With regard to the realignment of complementary 
bases on the template, it has been suggested that coronavirus 
polymerase might copy the template RNA in a fashion analo- 
gous to DNA-dependent RNA polymerases, where regulatory 
elements in the template and nascent RNA disrupt elongation 
(59, 89) and cause retraction at pause sites (40, 41) and where 
the polymerase remains associated with the growing nascent 
strand rather than with the template. This would endow the 3’ 
end of the nascent minus strand with primer capability. Of 
interest, a similar mechanism was recently proposed for proof- 
reading by RNA polymerases (71), and several gene products 
of ORF 1b of coronaviruses could plausibly have a role in this 
sort of process (see section below on proteins involved in 
coronavirus transcription). 

Kinetics of plus-strand and minus-strand synthesis. Coro- 
naviruses resemble other plus-stranded RNA viruses and pro- 
duce plus strands at 50- to 100-fold excess of their minus- 
strand templates. Based on studies with the A59 strain of 
Mouse hepatitis virus (MHV-A59), coronavirus RNA synthesis 
is detectable as early as 2 to 3 h postinfection when metabolic 
labeling ([32P]orthophosphate or [3H]uridine) is employed and 
by 90 min postinfection when reverse transcription-PCR is 
used (63; data not shown). Throughout infection, syntheses of 
the various species of plus strands remain constant, i.e., in a 
fixed molar ratio, relative to each other. In addition, the same 
ratio is retained under conditions where overall viral RNA 
synthesis declines late in infection, after recovery from the 
inhibition of protein synthesis by cycloheximide, or after trans- 
fer from restrictive to permissive temperatures in the case of 
temperature-sensitive (ts) mutants. The pattern of plus-strand 
synthesis reflects a basic principle of coronavirus transcription; 
namely, the amount of genome- and subgenome-length 
mRNA synthesis is determined at the level of minus-strand 
synthesis. 

Minus-strand synthesis begins 75 to 90 min after infection. 
At this time, genome-length and subgenome-length minus 
strands and their plus-strand complements can be detected 
simultaneously by reverse transcription-PCR (S. Sawicki, un- 
published data). Therefore, once RTCs form, they immedi- 
ately start to produce minus-strand RNA. Minus strands accu- 
mulate exponentially, and their rate of synthesis peaks at 5 to 
6 h postinfection; their synthesis then declines but does not 
cease (62). Why the assembly of RTCs is hindered beyond 6 h 
postinfection is not known. 

Coronavirus RIs/TIs and RFs/TFs. The templates in viral
RTCs are recoverable after deproteinization of infected-cell
lysates in two multistranded forms, a replication intermediate
(RI) with genome-length templates and transcription intermediates
(TIs) with templates corresponding in length to subgenome-length
mRNAs, and in two similarly sized double-stranded forms, the
replication form (RF) with genome-length templates
and the transcription forms (TFs) with templates
corresponding in length to subgenome-length mRNAs. The
native RF/TF structures are essentially single-stranded, but 
they become largely double-stranded following deproteiniza- 
tion. The TIs and TFs are authentic transcription structures 
that are active in subgenomic mRNA synthesis (60, 64). 

In MHV-infected cells, all of the RTCs making minus and 
plus strands are labile. The short-lived nature of MHV minus- 
strand-synthesizing complexes is a constant feature throughout 
infection. It is demonstrated by inhibiting the synthesis of nsp’s 
with cycloheximide (62), which means that only newly made 
viral proteins function in minus-strand synthesis and suggests 
that polyprotein intermediates probably function in minus- 
strand synthesis. The instability of plus-strand-synthesizing 
RTCs is more difficult to observe but can be seen under certain 
conditions. For instance, at 7 to 8 h postinfection, plus-strand 
synthesis starts to decline, but it is difficult to study because 
with MHV-A59 cell fusion also occurs. However, by use of a 
mutant of MHV-A59 that does not cause cell fusion (61), RNA 
synthesis can be seen to decline to undetectable levels by 12 to 
15 h postinfection. This occurs even though viral proteins, at 
least the structural proteins, are being produced at high levels. 
Another way to demonstrate the lability of plus-strand-synthe- 
sizing RTCs is to inhibit protein synthesis early in infection 
when plus-strand synthesis is increasing. Plus-strand synthesis 
stops increasing almost immediately, starts to decline an hour 
later, and disappears after 4 to 5 h. Under these circumstances, 
the synthesis of all species of viral plus strands is equally 
affected. 

What is the basis of the instability shown by plus-strand- 
synthesizing RTCs? One clue is that RI/RF and TI/TF struc- 
tures disappear coincidently in time with the loss of plus-strand 
synthesis (63). This could reflect activation of specific RNase 
activities targeting minus strands. Our hypothesis is that coro- 
navirus RTCs age and lose their activity and then their minus- 
strand templates; the virus is then required to produce new 
RTCs throughout infection. 

Biogenesis of minus strands. The current model of corona- 
virus RNA synthesis proposes that minus strands arise from 
copying the genome continuously to form genome-length tem- 
plates and discontinuously to form subgenome-length tem- 
plates. While early experiments ruled out a role for splicing in 
the generation of viral subgenome-length mRNAs, it remained 
possible that the genome-length minus strands are subject to 
splicing, which would result in the formation of subgenome- 
length minus-strand templates with the same 5’ and 3’ ends. To 
test this possibility, we determined the sensitivity of MHV 
minus-strand synthesis to UV irradiation (data not shown). 
The kinetics of overall minus-strand sensitivity showed a dose 
dependency that was not linear and did not follow the inacti- 
vation of synthesis of any of the individual plus strands. Rather, 
it reflected the sum of their individual sensitivities. This result 
supports the argument that the subgenome-length minus- 
strand templates do not arise via splicing of genome-length 
minus strands. This conclusion is also supported by studies 
using replicons derived from the infectious clone of Human 
coronavirus, strain 229E (HCoV-229E) where, in the absence 
of N protein expression, the synthesis of a subgenome-length 
mRNA for green fluorescent protein (GFP) occurred even 
when there was little or no replication of the replicon-length 
RNA (68). Since the subgenome-length mRNA encoding GFP 
had a leader fused to its 5’ end and was produced in the 
absence of demonstrable replicon-length minus-strand synthe- 
sis, discontinuous synthesis rather than splicing was likely 
responsible for the generation of the minus-strand template. 
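
Why a sum of individual sensitivities gives a nonlinear dose response can be shown with a small numerical sketch. It assumes single-hit inactivation in which the UV sensitivity of each template is proportional to its length; that assumption, the rate constant k, and the two-species mixture are hypothetical and are not data from the experiment described above.

```python
import math


def surviving_synthesis(dose, species, k=1.0):
    """Fraction of total synthesis remaining after a UV dose.

    `species` is a list of (fraction_of_total, relative_target_length) pairs.
    Each species is assumed to decay as exp(-k * length * dose).
    """
    return sum(f * math.exp(-k * length * dose) for f, length in species)


def log_slope(species, d0, d1):
    """Slope of ln(surviving synthesis) between two doses."""
    s0 = surviving_synthesis(d0, species)
    s1 = surviving_synthesis(d1, species)
    return (math.log(s1) - math.log(s0)) / (d1 - d0)


# Hypothetical mixture: half of the synthesis has a long target,
# half has a target one-tenth as long.
mixture = [(0.5, 1.0), (0.5, 0.1)]
single = [(1.0, 1.0)]

# A single species gives a straight line on a semilog plot (constant slope).
print(log_slope(single, 0, 1), log_slope(single, 1, 2))

# The mixture does not: the slope flattens as the long target is inactivated.
print(log_slope(mixture, 0, 1), log_slope(mixture, 1, 2))
```

In this sketch the curve for the mixture bends because the most sensitive component is lost first, which is the qualitative behavior expected when the overall response is the sum of components with different sensitivities.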

Although they are central to any understanding of corona- 
virus transcription, fundamental issues regarding the mecha- 
nisms of the RTCs responsible for minus-strand synthesis 
remain to be addressed. The most obvious remaining question 
is the mechanism of discontinuous synthesis. However, it is 
also important to ask other questions. Can a single genome 
serve as a template for the synthesis of only one minus strand, 
or can it serve simultaneously for all species of minus strands? 
Is the ratio of each minus-strand product per genome 1 or 
greater than 1? We believe that the fixed ratio of different 
minus strands produced early and late during infection sug- 
gests that each genome serves as a template for the complete 
nested set of minus strands, with cis-active signals and the 3’ 
position relative to the genome influencing or determining the 
final abundance or molar ratios of each product. This predic- 
tion can be tested, however, only when a sensitive method is 
devised to analyze the templates and products of RTCs en- 
gaged in minus-strand synthesis. 

TABLE 1. Enzymatic activities and characteristics of coronavirus nsp protein domains

| Activity | Designation | Protein family | Architecture | Comments | References |
|---|---|---|---|---|---|
| Papain-like proteinase | One or two PLpro domains within nsp3 | Deubiquitinating enzyme family | Finger-palm-thumb | Fingertips contain zinc-binding domain | 5, 33, 56 |
| ADP-ribose 1''-phosphatase | ADRP (or X) domain within nsp3 | Macro-H2A-fold family | Three-layered α/β/α | Also found in other plus-strand RNA viruses | 53, 58 |
| 3C-like cysteine proteinase | nsp5 (3CLpro or main protease, Mpro) | Two-β-barrel proteinase family | Twelve antiparallel β-strands | Extra, α-helical, carboxyl-terminal domain III | 3, 4, 95 |
| RNA-dependent RNA polymerase | RdRp domain within nsp12 | Viral RdRp family | Finger-palm-thumb (predicted) | RdRp motif VI signature GDD replaced with SDD | 19, 87 |
| 5’-to-3’ helicase (associated NTPase and RNA 5’-triphosphatase activities) | HEL domain within nsp13 | Superfamily 1 helicase | Modeled to Escherichia coli Rep helicase | Adjacent to amino-proximal, binuclear, zinc-binding domain | 9, 35, 36, 70 |
| 3’-to-5’ exonuclease | ExoN domain within nsp14 | DEDD superfamily | Hexahelical bundle (predicted) | Unable to cleave ribose 2’-O-methylated RNA substrates | 45, 75 |
| Uridylate-specific endoribonuclease | NendoU domain within nsp15 | XendoU family | Wing-body-wing (butterfly fold) | Active in hexameric form; RNA bound to internal cavity | 10, 11, 31, 34, 56a, 88 |
| S-adenosylmethionine-dependent 2’-O-methyltransferase | nsp16 (MT) | RrmJ methyltransferase family | Methyltransferase fold (predicted) | […] yet to be determined | 75, 86 |

Promoters for minus- and plus-strand synthesis. It is difficult
to reconcile experiments investigating the sequences and
structures found at the 5’ and 3’ ends of coronavirus RNA with
simple models for the initiation of plus- and minus-strand
synthesis. Two regions of the 3’ untranslated region (UTR)
have been suggested as containing cis-acting regulatory ele- 
ments that play a role in coronavirus RNA synthesis. The first 
region of ~150 nucleotides adjoins the poly(A) stretch and is 
predicted to form a number of different stem-loop structures 
(37, 57). One of these is known as the stem-loop II motif and 
has been identified in the Severe acute respiratory syndrome 
coronavirus (SARS-CoV) genome (57). This region also contains
the sequence 5’-GGAAGAGC-3’ that is present in all
coronavirus genomes. The second region is upstream of the 
terminal 150 nucleotides and contains two structures, known as 
the bulged stem-loop and the hairpin-type pseudoknot. It is 
interesting that the pseudoknot structure involves nucleotides 
at the base of the bulged stem-loop structure, which means the 
structures are mutually exclusive. This may represent a form of 
“molecular switch” related in an as-yet-unknown way to dif- 
ferent modes of RNA synthesis (27). Structures implicated in 
RNA synthesis have also been predicted in the 5’ UTR of the 
coronavirus genome (14, 54, 55). These include four higher- 
order structures, stem-loops I to IV, that may function as 
cis-acting elements in RNA replication and transcription. 
Stem-loops I and III demonstrate a cis-acting function in 
Bovine coronavirus (BCoV) defective interfering (DI) RNA 
replication. Stem-loop II harbors the leader TRS motif, and 
stem-loop IV may be involved in the initiation of minus-strand 
synthesis (55). 

In a model analogous to the models for picornavirus repli- 
cation-transcription (8), we propose that the 3’ and 5’ ends of 
the coronavirus genome interact, either directly (RNA to 
RNA) or indirectly (protein to RNA or protein to protein), to 
form the promoter for minus-strand synthesis. Only genomes 
containing a 5’ element downstream of the leader would be 
able to engage the 3’ end to serve as templates for minus- 
strand synthesis. The subgenome-length mRNAs would be 
missing the 5’ element (although they would all contain the 3’ 
element), and this would explain why they are not able to 
replicate (43, 54, 55). Similarly, the 3’ end of minus strands 
presumably must function as part of the promoter for plus- 
strand synthesis. Again, we would propose that the 3’ and 5’ 


ends of the minus strand might interact to circularize the tem- 
plate and create functional plus-strand promoters. DI minus 
strands are capable of replicating when introduced into helper 
virus-infected cells (38), and this indicates that minus-strand 
templates are capable of being recruited in trans into a func- 
tional RTC to produce plus strands. Thus, either the minus- 
strand templates of RTCs are released after completion of 
synthesis of the plus strand, allowing the RTC to initiate RNA 
synthesis on another minus-strand template, or newly assem- 
bled RTCs might recognize minus strands in trans and not be 
obliged to first create a minus-strand template by copying a 
genome. 


PROTEINS INVOLVED IN CORONAVIRUS 
TRANSCRIPTION 


As mentioned above, sequence analyses of the nsp proteins 
of MHV, SARS-CoV, and other coronaviruses predict that they 
include domains associated with at least eight enzymatic activ- 
ities (75). The enzymatic activities and characteristics of these 
protein domains are listed in Table 1. Crystallographic or nu- 
clear magnetic resonance structures have been determined for 
the PLpro and ADRP (or X) domains of nsp3, nsp5, nsp7, nsp8, 
nsp9, and nsp10 (3, 39, 51, 56, 58, 82, 93). In addition, a 
number of other features, including domains with conserved 
cysteine and histidine residues (C/H domains in nsp3, nsp13, 
and nsp14), putative transmembrane domains (TM domains in 
nsp3, nsp4, and nsp6), and domains with conserved features 
(the A [acidic] and Y domains within nsp3) have also been 
identified in coronavirus nsp proteins (94). 

In addition to the proteins for which enzymatic functions 
have been predicted or demonstrated, the structures of four 
other coronavirus nsp proteins have been reported. The 
SARS-CoV proteins nsp7 and nsp8 cocrystallize to produce a 
supercomplex that can be viewed as a hollow, cylinder-like 
structure assembled from eight copies of nsp8 and eight copies 
of nsp7. This complex has a central channel with positive elec- 
trostatic properties favorable for nucleic acid binding. It has 
been suggested that the role of this structure may be to confer 
processivity to the viral RdRp (51, 93). Second, the crystal 
structure of SARS-CoV nsp9 has been reported as comprising 
a single β-barrel with a fold that resembles a carboxyl-extended 
oligonucleotide-oligosaccharide binding (OB) fold. The crystal 
structure suggests that the protein is dimeric, and gel shift 
assays show that it is able to bind single-stranded RNA (24, 
82). And finally, recent studies have elucidated the crystal 
structure of the SARS-CoV nsp10 protein (39). The structure
predicts a single-domain protein consisting of a pair of antiparallel
amino-terminal helices stacked against an irregular
β-sheet, together with a coil-rich carboxyl terminus and two
zinc fingers. Twelve subunits assemble to form a unique do- 
decameric superstructure. The nsp10 protein is the first repre- 
sentative of a new family of zinc-finger proteins which, so far, 
are found exclusively in coronaviruses (39, 81). 

Many review articles on coronavirus RNA synthesis include 
references to a nonstructural protein, ns2, which is encoded in 
the genome of a subset of group II coronaviruses, including 
MHV, BCoV, and HCoV-OC43. The ns2 gene lies immedi- 
ately downstream of ORF1b, and the protein is translated from 
a subgenome-sized mRNA. This protein has been predicted to 
have cyclic phosphodiesterase (CPD) activity, although this has 
not yet been experimentally verified. It is tempting to envisage 
this activity, together with the ADRP activity of nsp3, as acting 
in the regulation of a pathway involving the processing of 
(viral) RNA by enzymes (for example, the NendoU activity of 
nsp15) that generate intermediates with terminal cyclic phos- 
phate residues (78). It is worth remembering, however, that 
only a few coronaviruses encode this protein, that it is not 
essential for virus replication in cell culture (69), and that there 
is no evidence to suggest that it is recruited into the corona- 
virus RTC. 


GENETICS OF CORONAVIRUS TRANSCRIPTION 


Forward genetics. The classical approach to the genetic 
analysis of coronavirus replication and transcription, i.e., the 
characterization of conditionally lethal, usually ts, virus mu- 
tants that are unable to synthesize RNA when the infection is 
initiated and maintained at the nonpermissive temperature, 
has been adopted in only a few laboratories. This is disappoint- 
ing, because the essential feature of such mutants is that they 
are likely to be defective in different aspects of viral RNA 
synthesis, and a detailed characterization of their genotype and 
phenotype should provide insights into the mechanisms of 
RNA synthesis, the functions of individual viral replicase pro- 
teins, and the protein-RNA and protein-protein interactions 
that regulate the activity of the RTC. These conditional-lethal 
mutants may also be used in a cis-trans test to define the 
number of complementation groups, or cistrons, that contrib- 
ute to a specific phenotype. This sort of analysis can provide 
valuable insight into the possible pathways that polyproteins 
must travel to assume functional configurations. 
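
The bookkeeping behind a complementation analysis can be sketched as a grouping problem: mutants that fail to complement one another are placed in the same group. The Python example below is a minimal illustration using a union-find structure; the mutant names m1 to m5 and the pairwise results are hypothetical and do not correspond to any mutants discussed in this review.

```python
def complementation_groups(mutants, non_complementing_pairs):
    """Group mutants so that any pair that fails to complement shares a group."""
    parent = {m: m for m in mutants}

    def find(m):
        # Follow parent links to the group representative (with path halving).
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in non_complementing_pairs:
        parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for m in mutants:
        groups.setdefault(find(m), []).append(m)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical results: m1/m2 and m2/m3 fail to complement each other;
# every other pairing restores RNA synthesis.
mutants = ["m1", "m2", "m3", "m4", "m5"]
failed = [("m1", "m2"), ("m2", "m3")]

groups = complementation_groups(mutants, failed)
print(groups)  # [['m1', 'm2', 'm3'], ['m4'], ['m5']]
```

The sketch treats failure to complement as transitive, which is the usual working assumption when cistrons are assigned; real data sets can contain partial or asymmetric complementation that this simple grouping would not capture.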

MHV-A59 mutants have been produced in a number of 
laboratories over a period of 20 years (67, 79, 80). In a recent 
analysis of 19 ts mutants that are unable to produce viral RNA 
at 39°C, the nonpermissive temperature, four complementa- 
tion groups were identified using classical and molecular 
complementation assays (66). Most of the 19 mutants, including
Alb ts6, Alb ts16, and LA ts6, were members of complementation
group 1. The lesions in these three mutants were
located in nsp4, nsp5, and nsp10. Thus, by definition, these 
individual nsp’s are cis-acting, i.e., their ts defects cannot be 
complemented in trans by nsp’s made by a second virus. This 
result suggests three possibilities: (i) nsp4-nsp10 function as a 
polyprotein before cleavage into individual polypeptides, (ii) 
nsp4-nsp10 first assemble into an RTC and are then cleaved, 
with a gain of function(s) expressed in individual polypeptides, 
or (iii) ts-induced alteration to the folding or interaction of 
protein domains within the nsp4-nsp10 region of pp1a/pp1ab 
interferes with polyprotein processing or function. In the same 
study, five further mutants could be assigned to three addi- 
tional complementation groups. Alb ts22 (mutation in nsp12) 
comprises group 2, Alb ts17 and Wü ts38 (mutations in nsp14)
comprise group 4, and Wü ts18 and Wü ts36 (mutation in
nsp16) comprise group 6. Thus, each of these nsp’s is trans-active
and diffusible between pp1ab complexes. Taken together,
the results of this analysis are consistent with the idea
that nsp4 to nsp10 of pp1a act together as a complex, multidomain
structure or scaffold onto which the trans-acting nsp’s
(e.g., nsp12, nsp14, and nsp16) and viral RNA associate. It will 
be interesting to discover whether other complementation 
groups can be assigned to nsp1, nsp2, nsp3, nsp13, or nsp15 
and if any of the individual nsp’s in nsp4 to -10 have later and 
separate functions, i.e., are trans-acting. This information will 
be especially interesting for nsp’s in pp1a that are thought to 
function as multimers. 

Another major advantage of using ts mutants is that they can 
be analyzed by temperature-shift protocols; i.e., by allowing 
them to gain function at the permissive temperature, shifting 
to the nonpermissive temperature, and then determining if the 
function is lost. Generally, their failure to lose function after 
the temperature shift indicates that the amino acid change in 
the mutant has affected the formation of the complex or ac- 
tivity. If they lose function, the mutation probably directly 
affects function. When this analysis was done for the mutants 
described above, we were surprised to find that MHV-A59 
mutants from cistrons 2, 4, and 6 were of the latter type. In one 
case, Alb ts22, where the mutation lies in the virus RdRp, it is 
easy to rationalize this phenotype. However, we also found that 
mutations in nsp14 and nsp16 have an immediate effect on 
plus-strand RNA synthesis by preformed RTCs after the shift 
to 39°C, showing that the activity of the MHV-A59 RTC is 
dependent on a complex structure of interacting domains contributed 
by nsp’s encoded in ORF1b. In contrast, mutations in 
nsp4, nsp5, and nsp10 (cistron 1, i.e., Alb ts6, Alb ts16, and LA 
ts6) did not inhibit ongoing plus-strand synthesis following the 
temperature shift. This result is consistent with the notion that 
the nsp4-to-nsp10 portion of pp1a is providing structural and 
not catalytic components of the complex (see Addendum in 
Proof). 
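
The reasoning behind the temperature-shift protocol can be written out as a short decision rule. The Python sketch below is only an illustration of that rule as stated in the text; the function names and the dictionary are hypothetical, and the true/false values simply restate the outcomes reported above for ongoing plus-strand synthesis in each cistron. They are not measurements.

```python
def interpret_shift(lost_function_after_shift: bool) -> str:
    """Apply the rule of thumb for a ts mutant that first gained function
    at the permissive temperature and was then shifted up."""
    if lost_function_after_shift:
        # Function disappears once the preformed complex is at 39 degrees C.
        return "mutation probably affects function directly"
    # Function persists, so the defect lies upstream of the working complex.
    return "mutation probably affects formation of the complex"


# Hypothetical encoding of the outcomes summarized in the text, keyed by
# cistron: was ongoing plus-strand synthesis lost after the shift?
PLUS_STRAND_LOST_AFTER_SHIFT = {1: False, 2: True, 4: True, 6: True}


def summarize(outcomes: dict) -> dict:
    """Map each cistron to its interpretation, in cistron order."""
    return {c: interpret_shift(lost) for c, lost in sorted(outcomes.items())}


if __name__ == "__main__":
    for cistron, reading in summarize(PLUS_STRAND_LOST_AFTER_SHIFT).items():
        print(f"cistron {cistron}: {reading}")
```

Under this reading, cistron 1 falls into the structural class, and cistrons 2, 4, and 6 fall into the class in which the mutation acts on function directly.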

An RNA-negative phenotype, i.e., the failure to synthesize 
viral RNA at the nonpermissive temperature, might be due to 
the inability to form a minus-strand RTC or the inability to 
convert the minus strand into an RTC-making plus strand (the 
so-called conversion phenotype) (22). A more detailed analysis 
of the cistron 1 mutants showed that mutant LA ts6 was de- 
fective in continuing minus-strand synthesis after the shift to 
the nonpermissive temperature. This result implicates nsp10 in 
minus-strand synthesis. In contrast, Alb ts16 appeared to have 
a conversion phenotype: Alb ts16-infected cells made minus 
strands but did not increase the rate of plus-strand synthesis 
after shift to 39°C. Alb ts16 has a mutation in the carboxyl-terminal 
domain of nsp5 (3C-like cysteine proteinase 
[3CLpro]), and it will be important to determine if this mutation 
affects the activity of the proteinase (72, 84). It might 
affect the folding of pp1a or pp1ab, or the nsp5 C-terminal 
domain itself could have a function in plus-strand RNA syn- 
thesis. Nevertheless, because minus-strand RNA synthesis 
ceased in Alb ts16-infected cells that were treated with cycloheximide 
to block translation of new nsp’s at the time of the 
shift to nonpermissive temperature (66), we propose that the 
Alb ts16 RTC does not artifactually retain its activity for minus-strand 
synthesis. Rather, it fails to gain plus-strand synthesis 
activity at the nonpermissive temperature. We favor a model 
where the activity that makes plus strands is gained at the 
expense or loss of the activity to make minus strands. 

Reverse genetics. The development of infectious cDNA con- 
structs for coronaviruses is a recent event, and the use of 
reverse genetics in the analysis of coronavirus transcription is 
now gathering momentum. Already, a number of important 
findings have been made. First, reverse genetics, in particular 
the method of targeted RNA recombination (44), has been 
used to analyze cis-acting elements that regulate coronavirus 
transcription, as discussed in the above section on coronavirus 
RNA replication and transcription. Second, reverse genetics 
has been used to confirm the involvement of coronavirus nsp 
proteins in RNA synthesis and to discriminate between essen- 
tial and nonessential functions (6, 26, 83). For example, mu- 
tation at the active site of the HCoV-229E nsp15 NendoU 
domain or the active site of the HCoV-229E nsp14 3’-to-5’ 
exonuclease (ExoN) domain suggests that these functions are 
essential for virus RNA synthesis in cell culture (34) although, 
in some cases, residual viral RNA synthesis was detected in 
cells transfected with mutant RNA. In contrast, mutation of 
the active site of the HCoV-229E nsp3 ADRP domain had no 
significant effect on virus RNA synthesis or virus titer, and no 
reversion to wild-type sequence was observed when the mutant 
virus was passed in cell culture (53). 

Reverse genetics has been used to show that the MHV and 
SARS-CoV nsp2 proteins are not essential for virus replication 
but that their deletion attenuates virus growth and virus RNA 
synthesis (30). In a similar fashion, MHV mutants that are 
incapable of liberating nsp1 from the nascent polyprotein (i.e., 
nsp1/nsp2 cleavage mutants) exhibit delayed replication, low 
titers, small plaque morphology, and reduced RNA synthesis 
compared to wild-type virus (23). A more extensive deletion 
and site-specific mutagenesis study of the MHV nsp1 has iden- 
tified domains and residues that are important for polyprotein 
processing and virus RNA synthesis (16). Although informa- 
tive, this sort of reverse genetic analysis must be approached 
with caution. The frequency of lethal mutation is likely to be 
high, and there is, as always, the possibility of indirect effects, 
such as phenotypes caused by the disruption of cis-acting RNA 
structures. 


ROLE OF THE N PROTEIN 


The coronavirus N protein has been implicated in virus 
RNA synthesis by three lines of evidence. First, the N protein 
was shown to bind specific RNA sequences, including the 
leader sequence, the TRS sequence, and sequences located at 
the 3’ end of the virus genome (73). Second, in addition to a 
cytoplasmic distribution in the host cell, at least a fraction of 
the N protein colocalizes with RTCs early in infection (13, 85). 
And third, there is clearly a requirement for sustained trans- 
lation of the N protein in trans or in cis for optimal replication 
of BCoV DI RNA and Transmissible gastroenteritis virus-de- 
rived replicons (1, 18). The importance of the N protein was 
also emphasized when it was shown that the rescue of recom- 
binant coronaviruses from cells transfected with infectious 
RNA was greatly enhanced by, if not dependent upon, the 
expression of N protein in the same cell (2, 17, 20, 90-92). 

These observations were followed up by experiments show- 
ing that, in the absence of N protein, HCoV-229E replicons 
expressing GFP from a subgenomic mRNA produced low lev- 
els of GFP and little or no amplification of the replicon. In 
contrast, the expression of N in the same cells, in cis or in trans, 
significantly increased the number of transduced cells, the lev- 
els of GFP expression and, concomitantly, the amplification of 
the replicon (68). Our interpretation of these data is that the N 
protein has a role in determining the ratio of genome- to 
subgenome-length minus strands and, in the absence of N, this 
ratio is perturbed, with the underproduction of genome-length 
templates. In the virus-infected cell, there should always be 
sufficient N protein, either introduced with the nucleocapsid 
structure or synthesized within 90 min of infection (see the 
section on coronavirus RNA replication and transcription), to 
fulfill this role. Therefore, according to this interpretation, N 
does not function as a replication-transcription switch. 

What could the role of the N protein be? One possibility is 
that it is acting as an RNA chaperone as proposed for the N 
protein of hantaviruses (46), where chaperone activity results 
in the transient dissociation of RNA structures, including RNA 
duplexes with adjoining single-stranded regions, which may be 
required for facilitating correct higher-order RNA structure. 
For coronaviruses, such chaperone activity could be important 
for the initiation of minus-strand synthesis or, perhaps, for 
suppressing attenuation and/or template switching at the TRS 
element during discontinuous synthesis (25). A second possibility 
stems from the fact that coronaviruses possess helical nucleocapsids. This is not common 
among plus-stranded RNA viruses, Tobacco mosaic virus being 
another exception, but is typical of minus-stranded RNA vi- 
ruses. It is possible that the role of the coronavirus N protein 
is to associate with the genomic RNA to produce a template 
that is “configured” to balance the ratio of RTCs engaged in 
either transcription or replication, as has been proposed for 
measles virus (12). In this respect, it is also worth noting that 
replication and transcription from the genome of EAV, a virus 
that has an icosahedral nucleocapsid structure, do not appear 
to involve N protein function (47). 


WHY IS CORONAVIRUS TRANSCRIPTION 
SO COMPLEX? 


Two reasons are most often given for the complexity of 
coronavirus transcription. (i) The generation of multiple sub- 
genomic mRNAs and the process of discontinuous transcription 
during minus-strand synthesis demand complex replication-transcription 
machinery. (ii) The large size of the coronavirus genome 
demands unusual activities to maintain genetic stability. 

With regard to the first reason, it is true that the generation 
of multiple minus-strand RNAs from a single template must be 
a complex process but, essentially, it is only the process of 
discontinuous transcription during minus-strand synthesis that 
is unique. The initiation of plus-strand and minus-strand syn- 
thesis, elongation, and termination are carried out by all RNA 
viruses, usually with a smaller number of replicase components 
than are found in the coronavirus RTC. Also, it is important to 
realize that arteriviruses, which, as already mentioned, are a 
family of viruses related to coronaviruses and have similar, if 
not identical, mechanisms of subgenome-length mRNA syn- 
thesis, do not encode proteins with functions analogous to the 
ExoN, S-adenosylmethionine-dependent 2’-O-methyl trans- 
ferase (MT), ADRP, or CPD functions of coronaviruses (75). 
This implies that these functions, although they may be an 
integral part of the coronavirus RTC, are not needed to allow 
for discontinuous transcription. Biological advantages to dis- 
continuous transcription include economies of coding and the 
ability to regulate individual mRNA abundance, but these do 
not seem to justify the extraordinary genetic investment that 
coronaviruses have made in replicase-transcriptase and niche- 
specific proteins. 

An alternative view is proposed by Gorbalenya and col- 
leagues (28). Basically, these authors argue that the acquisition 
of the coronavirus enzymatic activities may have improved the 
fidelity of RNA replication and transcription to allow for ge- 
nome expansion. This, in turn, would provide coronaviruses 
with the opportunity, for example, to expand their host range 
and adapt rapidly to changing environmental conditions while 
maintaining genomic stability. Specifically, it is proposed that 
the HEL, ExoN, NendoU, and MT functions may provide 
RNA specificity and that the ADRP and CPD functions (when 
present) modulate the pace of a reaction in a common path- 
way, which could be part of an oligonucleotide-directed repair 
mechanism (75). The existence of such a repair mechanism in 
coronaviruses, with similarities to the “proofreading” or repair 
activities associated with DNA replication, would require a 
paradigm shift in our view of RNA virus replication. 


ADDENDUM IN PROOF 


It was recently shown that SARS-CoV encodes a second, noncanonical 
RdRp residing in nsp8, and it was proposed that the 
nsp8 RdRp produces primers utilized by the primer-dependent 
nsp12 RdRp (I. Imbert, J. C. Guillemot, J. M. Bourhis, C. Bus- 
setta, B. Coutard, M. P. Egloff, F. Ferron, A. E. Gorbalenya, and 
B. Canard, EMBO J. 25:4933-4942, 2006). 